Insights from Russian second language readability classification: complexity-dependent training requirements, and feature evaluation of multiple categories

نویسنده

  • Robert Reynolds
چکیده

I investigate Russian second language readability assessment using a machine-learning approach with a range of lexical, morphological, syntactic, and discourse features. Testing the model with a new collection of Russian L2 readability corpora achieves an F-score of 0.671 and adjacent accuracy 0.919 on a 6-level classification task. Information gain and feature subset evaluation shows that morphological features are collectively the most informative. Learning curves for binary classifiers reveal that fewer training data are needed to distinguish between beginning reading levels than are needed to distinguish between intermediate reading levels.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A novel hybrid method for vocal fold pathology diagnosis based on russian language

In this paper, first, an initial feature vector for vocal fold pathology diagnosis is proposed. Then, for optimizing the initial feature vector, a genetic algorithm is proposed. Some experiments are carried out for evaluating and comparing the classification accuracies which are obtained by the use of the different classifiers (ensemble of decision tree, discriminant analysis and K-nearest neig...

متن کامل

Testing Problems in Russian as a Foreign Language in a Technical University

 Problems of theory and practice of the Russian as a foreign language testing for entrants in technical universities are considered. The benefits of test forms for controlling the foreign students’ skills in the Russian language during a hard time limit are presented. The structure and content of the tests, all types of tasks offered on the entrance and final examinations in the Russian languag...

متن کامل

Single-Sentence Readability Prediction in Russian

In an effort to make reading more accessible, an automated readability formula can help students to retrieve appropriate material for their language level. This study attempts to discover and analyze a set of possible features that can be used for single-sentence readability prediction in Russian. We test the influence of syntactic features on predictability of structural complexity. The readab...

متن کامل

Selection of Foreign Language Teaching Content in Russian Master of Laws (LLM) Graduate Programs

Master`s degree was integrated into the system of Russian Higher Education several decades ago, however, teaching foreign languages at this level still needs further analysis including the postgraduate law students training. The article investigates the principal components of foreign language teaching in Master of laws Graduate Programs (considering the case of the English language) on the bas...

متن کامل

On-The-Fly Translator Assistant (Readability and Terminology Handling)

This paper describes a new methodology for developing CAT tools that assist translators of technical and scientific texts by (i) on-the-fly highlight of nominal and verbal terminology in a source language (SL) document that lifts possible syntactic ambiguity and thus essentially raises the document readability and (ii) simultaneous translation of all SL document oneand multicomponent lexical un...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016